Method of multiple-antenna wireless communication using space-time codes

ABSTRACT

In a method of wireless transmission, space time matrices are used to spread the transmission of data over two or more transmit antennas and/or over two or more symbol intervals. Initially, blocks of data are encoded as symbols, each being a complex amplitude selected from a symbol constellation. A finite set of space-time matrices, referred to as “dispersion matrices,” is predetermined. In transmission, a group of symbols are transmitted concurrently. Each of the symbols to be transmitted is multiplied by a respective dispersion matrix. Thus, a composite matrix, proportional to a sum of dispersion matrices multiplied by their corresponding symbols, is modulated onto a carrier and transmitted. In reception, knowledge of the dispersion matrices is used to recover the transmitted symbols from the received signals corresponding to the composite matrix that was transmitted.

This application claims benefit of U.S. Provisional Ser. No. 60/224,685, filed Aug. 11, 2000.

FIELD OF THE INVENTION

This invention relates to wireless radiofrequency communication, and more specifically, to methods of signal modulation for communication using multiple-antenna arrays.

ART BACKGROUND

In wireless communication, certain advantages are offered by the use of multiple antenna elements for transmission, whether with one or with more than one receiving antenna element. These advantages include the potential to mitigate fading effects, and the potential to increase data transmission rates in a propagation channel of given characteristics.

A variety of schemes have been proposed for modulating data to be transmitted from a multiple-element array. In some of these schemes, referred to generally as space-time modulation, the data are transmitted in the form of codewords distributed in space (i.e., across the antenna array) and in time. Such a codeword comprises a plurality of complex-valued amplitudes modulated onto a carrier wave.

Within a given time interval, referred to as a symbol interval, a complex amplitude (which might be zero) is transmitted from each element of the antenna array. Conversely, at each element of the array, a sequence of amplitudes is transmitted over a succession of symbol intervals. The concurrent transmission of amplitudes from the elements of the array during one symbol interval is referred to as a channel use.

A codeword of the kind described above can be represented by a matrix. The respective entries of the matrix are proportional to the complex amplitudes to be transmitted. Each column of the matrix corresponds, e.g., to a respective transmitting antenna, and each row corresponds, e.g., to a respective symbol interval.

A variety of schemes have also been proposed for recovering the transmitted data from signals received by a single receiving antenna or a multiple-element receiving antenna array. Mathematical models of the propagation channel between the transmitting and receiving antennas generally include a matrix of channel coefficients, each such coefficient relating the amplitude received at a given element of the receiving array to the amplitude transmitted from a given element of the transmitting array. In some of the known reception schemes, the channel coefficients are assumed to be known, exemplarily from measurements made using pilot signals.

When the channel coefficients are known, methods of signal recovery can be used that effectively invert the channel matrix. Both direct and indirect methods are known for effectively inverting the channel matrix. Among the indirect methods are Maximum Likelihood (ML) detectors. Given an estimate of the channel matrix and a received signal, an ML detector computes a likelihood score for each of a plurality of candidate codewords, and selects the candidate codeword that yields the highest score. Because of noise and uncertainties in the channel coefficients due to fading, received signals are generally corrupted to a greater or lesser extent. Thus, it is advantageous to use codewords for which the likelihood scores have high discriminating power, even in the presence of fading and noise.

One known method of space-time modulation is V-BLAST. In V-BLAST, an initial stream of data is apportioned into separate sequences of amplitudes, each of which is independently transmitted from one of the transmitting antenna elements. In effect, the codeword can be represented by a row vector having M entries, where M is the number of transmitting antennas. The single row represents a single symbol interval. Typically, a new codeword is transmitted in each symbol interval. The independent sequence of amplitudes transmitted by each antenna can be referred to as a substream because it contains a respective subset of the data in the initial data stream.

Several schemes have been described for recovering V-BLAST signals. Some such schemes use ML detectors. According to another such scheme, the entries of the transmitted vector are recovered one-by-one, with each successive recovery utilizing the results of the previous recoveries. One example of such a scheme is described in the co-pending U.S. patent application Ser. No. 09/438,900, filed Nov. 12, 1999 by B. Hassibi under the title “Method and Apparatus for Receiving Wireless Transmissions Using Multiple-Antenna Arrays,” and commonly assigned herewith.

V-BLAST is advantageous in that it can be used for communication at relatively high data rates without excessive computational complexity in the decoding of the received signals. However, the decoding schemes that offer the lowest complexity require that the number N of receiving antennas equal or exceed the number M of transmitting antennas. Such a requirement is disadvantageous when, for example, a large installation such as a base station is transmitting to a small installation such as a hand-held mobile wireless terminal.

Another method of space-time modulation is described in S. M. Alamouti, “A simple transmitter diversity scheme for wireless communications,” IEEE J. Sel. Area Comm. (October 1998) 1451-1458. In the Alamouti scheme, each codeword is distributed over two transmit antennas and two symbol intervals. Each codeword is determined by two distinct complex amplitudes, each belonging to a respective substream. In the first symbol interval, one of the amplitudes is transmitted from the first antenna, and the other amplitude is transmitted from the second antenna. In the second symbol interval, the complex amplitudes are interchanged between the two antennas, one of the complex amplitudes changes sign, and the complex conjugates of the resulting amplitudes are transmitted. Significantly, when a codeword of this kind is expressed in the form of a matrix, the matrix has orthogonal columns.
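The codeword structure just described can be illustrated with a short numerical sketch. It assumes the common convention in which the second amplitude is the one that changes sign (the text above does not say which one does), with rows as symbol intervals and columns as antennas; the function name `alamouti_codeword` is illustrative only.

```python
import numpy as np

def alamouti_codeword(s1: complex, s2: complex) -> np.ndarray:
    """Return a 2x2 codeword: rows are symbol intervals, columns are antennas.

    Assumed sign convention: the second amplitude is negated in interval 2.
    """
    return np.array([[s1, s2],
                     [-np.conj(s2), np.conj(s1)]])

S = alamouti_codeword(1 + 1j, 1 - 1j)

# Gram matrix of the columns: off-diagonal entries are the inner products
# between columns, diagonal entries are the column energies.
gram = S.conj().T @ S
```

Under this convention the Gram matrix is a multiple of the identity, which is the orthogonal-column property noted above.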

One drawback of the Alamouti scheme is that it makes the most efficient use of the theoretical information capacity of the propagation channel only when there is a single receiving antenna. The channel capacity is used less efficiently when further receiving antennas are added. Thus, gains that might otherwise be expected in data rate and fading resistance from multiple-antenna receiving arrays are not fully realized.

Extensions of the Alamouti scheme to more than two transmitting antennas and more than two symbol intervals per codeword are also known. The Alamouti scheme and its extensions are referred to generally as orthogonal designs because the matrices that represent the codewords are required to be orthogonal; that is, each column of such a matrix is orthogonal to every other column of the matrix. A further requirement of orthogonal designs is that for a matrix to represent a codeword, all columns of the matrix must have the same energy. In this regard, the “energy” of a column is the scalar product of that column with its complex conjugate.

Until now, there has been an unmet need for a space-time modulation scheme that can handle high data rates with relatively low decoding complexity and that uses the potentially available channel capacity with relatively high efficiency for any combination (M, N) of transmission and reception antennas.

SUMMARY OF THE INVENTION

We have invented such a scheme. Our scheme uses space-time matrices to spread the transmission of data over two or more transmit antennas and/or over two or more symbol intervals. (A “matrix” in this regard may consist of a single column or a single row.) Initially, blocks of data are encoded as complex amplitudes selected from a finite set of such amplitudes. (Complex values include those that are pure real and pure imaginary.) We refer to each selected complex amplitude as a “symbol,” and we refer to the finite set as a “constellation.” If the constellation has r elements, then each symbol carries log₂ r bits of information.

The constellation is predetermined, and also a fixed, finite set of space-time matrices is predetermined. We refer to the matrices in this set as “dispersion matrices.” In the following discussion of an exemplary embodiment of the invention, we let Q represent the number of dispersion matrices in the fixed, finite set.

In transmission, Q symbols are transmitted concurrently. Each of the Q symbols to be transmitted is multiplied by a respective dispersion matrix. A composite matrix, proportional to the sum of all Q dispersion matrices multiplied by their corresponding symbols, is transmitted according to the principles of space-time modulation described above.

The elements of the dispersion matrices are advantageously selected according to a procedure that seeks to drive the rate at which data can be sent and received toward the information-theoretic channel capacity.

We have found that orthogonal designs fall significantly short of achieving the information-theoretic capacity of the channel whenever there are more than two transmitting antennas or more than one receiving antenna.

In contrast to transmission methods using orthogonal designs, our method does not constrain the columns of the transmitted, composite matrix to be orthogonal or to have equal energies.

In reception, knowledge of the dispersion matrices is used to recover the Q symbols from the received signals corresponding to the composite matrix that was transmitted.

In a broader aspect of the invention, the total number of dispersion matrices is 2Q. Half of the dispersion matrices are used to spread the real parts of the Q symbols, and the other half are used to spread the imaginary parts of the Q symbols. Alternatively, half of the dispersion matrices are used to spread the Q symbols, and the other half are used to spread the complex conjugates of the Q symbols.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a conceptual drawing of a multiple-antenna wireless communication system of the prior art.

FIG. 2 is a conceptual diagram showing the operation of linear dispersion coding according to the invention in some embodiments.

FIG. 3 is a flowchart illustrating the application of a criterion on the maximality of mutual information for designing a linear dispersion code according to the invention in some embodiments.

FIG. 4 is a graph showing the theoretical performance of certain coding techniques in terms of bit-error rate (ber) or block-error rate (bler) as a function of signal-to-noise ratio (SNR). Compared in the figure are the results of theoretical modeling of a linear dispersion code (solid curves labeled “LD Code”) and an orthogonal design (broken curves labeled “OD”).

DETAILED DESCRIPTION

Certain general features of space-time modulation will now be described with reference to FIG. 1. Let there be M transmit antennas 10.1-10.M, and N receive antennas 15.1-15.N. Let the propagation channel be reasonably well modeled as a narrow-band, flat-fading channel that is effectively constant and known to the receiver for a duration whose length is at least T symbol intervals. The transmitted signal can then be written as a T×M matrix S that governs the transmission over the M antennas during the interval.

Illustrated schematically in FIG. 1 is the transmission of the first row of the signal matrix S. During the first of T symbol intervals, the complex amplitude S₁₁ is modulated onto a radiofrequency carrier and transmitted from antenna 10.1, and each of the remaining complex amplitudes S₁₂, . . . , S_(1M) is modulated onto the carrier and transmitted from a corresponding antenna 10.2, . . . , 10.M.

At the receiving end, all of the transmitted amplitudes are intercepted by each of the N receiving antennas 15.1-15.N, with varying attenuations and phase delays determined by the characteristics of the propagation channel, which is described by the matrix H of channel coefficients. Thus, after demodulation to baseband, the signal from each receiving antenna resulting from each channel use is a linear combination of the amplitudes S₁₁, . . . , S_(1M), with complex weights determined by the propagation channel, plus additive noise. The outputs over T symbol intervals, corresponding to the response of the receiver to the transmission of matrix S, can be represented as a T×N matrix X+V, where X contains the linear combinations described above, and V contains the additive noise. Illustrated schematically in FIG. 1 is the receipt of the first row of matrix X+V.

Certain features of the present invention will now be described with reference to FIG. 2. The steps shown in FIG. 2 are merely illustrative. Those skilled in the art will appreciate that numerous alternative procedures will bring about equivalent results, and thus fall within the scope and spirit of the present invention.

A sequence 20 of data, exemplarily a binary sequence of 0's and 1's, is parsed into substreams. In the example shown, the number Q of substreams is 3, and each block of data in a substream carries three bits of information. A block 25 of data from each substream is mapped to a symbol 30 selected from constellation 35. The illustrative constellation shown in FIG. 2 is a set of eight uniformly spaced points on the unit circle in the complex plane. More typically, the constellation will be an r-PSK or r-QAM constellation.

In the example shown, the image of each block of data is a respective one of the symbols s₁, s₂, s₃. Each of these symbols directly multiplies a respective dispersion matrix A₁, A₂, A₃. In process 40, the complex conjugate is taken of each symbol, thus generating a further symbol. Each of the resulting complex conjugates multiplies a respective dispersion matrix B₁, B₂, B₃. In process 45, which is represented in the figure as a summation element, the signal matrix S is constructed by summing the six dispersion matrices, with each weighted by its corresponding symbol.

More generally, Q symbols s₁, . . . , s_(Q) are selected from an appropriate constellation. The signal matrix S is constructed according to: $\begin{matrix}{S = \sum\limits_{q = 1}^{Q}\left( \alpha_{q}A_{q} + j\beta_{q}B_{q} \right),} & (1)\end{matrix}$ where $\begin{matrix}{s_{q} = \alpha_{q} + j\beta_{q},\quad q = 1,\ldots,Q.} & (2)\end{matrix}$

We refer to a code of this kind as a rate R=(Q/T)log₂r linear dispersion (LD) code.

The code is completely specified by the fixed T×M complex matrices A₁, . . . , A_(Q) and B₁, . . . , B_(Q), which we refer to as dispersion matrices. Each individual codeword is determined by the scalars {s₁, . . . , s_(Q)}.

Alternatively, S is expressed by: $\begin{matrix}{S = \sum\limits_{q = 1}^{Q}\left( s_{q}C_{q} + s_{q}^{*}D_{q} \right),} & (3)\end{matrix}$ where the C_(q) and D_(q) are the fixed T×M dispersion matrices.

In specific implementations, one or more of the A_(q) or B_(q), or one or more of the C_(q) or D_(q), matrices could be zero. In fact, it is essential only that there be at least Q non-zero dispersion matrices.
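The two ways of writing the signal matrix can be compared numerically. The following is a minimal sketch, not the claimed implementation: the function names are illustrative, the dispersion matrices are random placeholders rather than optimized ones, and the conversion C_(q)=(A_(q)+B_(q))/2, D_(q)=(A_(q)−B_(q))/2 is the relation obtained by substituting s_(q)=α_(q)+jβ_(q) into Eq. (3) and matching terms with Eq. (1).

```python
import numpy as np

def ld_encode(symbols, A, B):
    """Eq. (1): S = sum_q (alpha_q A_q + j beta_q B_q); A, B have shape (Q, T, M)."""
    symbols = np.asarray(symbols, dtype=complex)
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    alpha, beta = symbols.real, symbols.imag
    return np.tensordot(alpha, A, axes=1) + 1j * np.tensordot(beta, B, axes=1)

def ld_encode_conj(symbols, C, D):
    """Eq. (3): S = sum_q (s_q C_q + conj(s_q) D_q)."""
    symbols = np.asarray(symbols, dtype=complex)
    return (np.tensordot(symbols, C, axes=1)
            + np.tensordot(symbols.conj(), D, axes=1))

def ab_to_cd(A, B):
    """Convert {A_q, B_q} to the equivalent {C_q, D_q} (derived relation)."""
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    return (A + B) / 2, (A - B) / 2

# Random placeholder matrices and symbols, for illustration only.
rng = np.random.default_rng(0)
Q, T, M = 3, 4, 2
A = rng.normal(size=(Q, T, M)) + 1j * rng.normal(size=(Q, T, M))
B = rng.normal(size=(Q, T, M)) + 1j * rng.normal(size=(Q, T, M))
s = rng.normal(size=Q) + 1j * rng.normal(size=Q)

S_ab = ld_encode(s, A, B)
C, D = ab_to_cd(A, B)
S_cd = ld_encode_conj(s, C, D)
```

Both encoders produce the same T×M matrix for any choice of symbols, which is why either parametrization can specify the code.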

As noted above, in a narrow-band, flat-fading, multi-antenna communication system with M transmit and N receive antennas, the transmitted and received signals are related by a linear relationship. We here represent that relationship by: $\begin{matrix}{x = \sqrt{\frac{\rho}{M}}Hs + \upsilon,} & (4)\end{matrix}$ where the complex N-dimensional vector x denotes the vector of complex received signals during any given channel use, the complex M-dimensional vector s denotes the vector of complex transmitted signals, the complex N×M matrix H denotes the channel matrix, and the complex N-dimensional vector υ denotes additive noise which, for purposes of theoretical analysis, is assumed to be spatially and temporally white; i.e., to be CN(0,1) (zero-mean, unit-variance, complex-Gaussian) distributed. For analytical purposes, the channel matrix H and transmitted vector s are assumed to have unit-variance entries, implying that E tr HH*=MN and E s*s=M, where E(.) denotes the statistical expected value.

Assuming that the quantities H, s, and υ are random and independent, the normalization $\sqrt{\frac{\rho}{M}}$ in Eq. (4) will ensure that ρ is the signal-to-noise ratio (SNR) at the receiver independently of M. For analytical purposes, it is also often (although not invariably) assumed that the channel matrix H has CN(0,1) entries.

The entries of the channel matrix are assumed to be known to the receiver but not to the transmitter. This assumption is reasonable if training or pilot signals are sent to learn the channel, which is then constant for some coherence interval. The coherence interval of the channel is preferably large compared to M.

When the channel is effectively constant for at least T channel uses, we may write for each symbol interval t, $\begin{matrix}{x_{t} = \sqrt{\frac{\rho}{M}}Hs_{t} + \upsilon_{t},\quad t = 1,\ldots,T,} & (5)\end{matrix}$ so that, defining X^(T)=[x₁ x₂ . . . x_(T)], S^(T)=[s₁ s₂ . . . s_(T)] and V^(T)=[υ₁ υ₂ . . . υ_(T)], we obtain $\begin{matrix}{X^{T} = \sqrt{\frac{\rho}{M}}HS^{T} + V^{T}.} & (6)\end{matrix}$ It is generally more convenient to write this equation in its transposed form $\begin{matrix}{X = \sqrt{\frac{\rho}{M}}SH + V,} & (7)\end{matrix}$ where we have omitted the transpose notation from H and simply redefined this matrix to have dimension M×N. The complex T×N matrix X is the received signal, the complex T×M matrix S is the transmitted signal, and the complex T×N matrix V is the additive CN(0,1) noise. In X, S, and V, time runs vertically and space runs horizontally.

We note that, in general, the number of T×M matrices S needed in a codebook can be quite large. If the rate in bits/channel use is denoted R, then the number of matrices is 2^(RT). For example, with M=4 transmit and N=2 receive antennas, the channel capacity at ρ=20 dB (with CN(0,1) distributed H) is more than 12 bits/channel use. Even with a relatively small block size of T=4, we need 2⁴⁸≈10¹⁴ matrices at rate R=12.
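The arithmetic behind these figures can be checked with a few lines. This is only an illustrative sketch; the helper names `ld_rate` and `codebook_size` are not from the specification, and the constellation sizes in the usage lines are the ones quoted in the Example section (256-QAM with Q=3, T=4 and 64-QAM with Q=4, T=4).

```python
from math import log2

def ld_rate(Q: int, T: int, r: int) -> float:
    """Rate R = (Q/T) log2(r) in bits per channel use."""
    return (Q / T) * log2(r)

def codebook_size(R: float, T: int) -> int:
    """Number of distinct T x M signal matrices, 2^(R T)."""
    return round(2 ** (R * T))

rate_od = ld_rate(Q=3, T=4, r=256)   # orthogonal design with 256-QAM
rate_ld = ld_rate(Q=4, T=4, r=64)    # LD code with 64-QAM
n_matrices = codebook_size(R=12, T=4)
```

Both rate computations give 6 bits per channel use, and the codebook at R=12, T=4 contains 2⁴⁸ matrices, far too many to enumerate, which motivates a structured code.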

LD codes can readily generate the very large constellations that are needed. Moreover, because of their structure, they also allow efficient real-time decoding.

Decoding. An important property of the LD codes is their linearity in the variables {α_(q),β_(q)}, leading to efficient decoding schemes such as those used in connection with V-BLAST. To see this, it is useful to write the block equation $\begin{matrix}{X = \sqrt{\frac{\rho}{M}}SH + V = \sqrt{\frac{\rho}{M}}\sum\limits_{q = 1}^{Q}\left( \alpha_{q}A_{q} + j\beta_{q}B_{q} \right)H + V} & (8)\end{matrix}$ in a more convenient form. We decompose the matrices in Eq. (8) into their real and imaginary parts to obtain $\begin{matrix}{X_{R} + jX_{I} = \sqrt{\frac{\rho}{M}}\sum\limits_{q = 1}^{Q}\left\lbrack \alpha_{q}\left( A_{R,q} + jA_{I,q} \right) + j\beta_{q}\left( B_{R,q} + jB_{I,q} \right) \right\rbrack\left( H_{R} + jH_{I} \right) + V_{R} + jV_{I}.} & (9)\end{matrix}$

Denoting the columns of X_(R), X_(I), H_(R), H_(I), V_(R), and V_(I) by x_(R,n), x_(I,n), h_(R,n), h_(I,n), υ_(R,n), and υ_(I,n), where n=1, . . . , N, we form the single real system of equations $\begin{matrix}{\underbrace{\begin{bmatrix}x_{R,1} \\ x_{I,1} \\ \vdots \\ x_{R,N} \\ x_{I,N}\end{bmatrix}} = \sqrt{\frac{\rho}{M}}\tilde{H}\underbrace{\begin{bmatrix}\alpha_{1} \\ \beta_{1} \\ \vdots \\ \alpha_{Q} \\ \beta_{Q}\end{bmatrix}} + \underbrace{\begin{bmatrix}\upsilon_{R,1} \\ \upsilon_{I,1} \\ \vdots \\ \upsilon_{R,N} \\ \upsilon_{I,N}\end{bmatrix}},} & (10)\end{matrix}$ where the equivalent 2NT×2Q real channel matrix is given by $\begin{matrix}{\tilde{H} = \begin{bmatrix}\begin{bmatrix}A_{R,1} & -A_{I,1} \\ A_{I,1} & A_{R,1}\end{bmatrix}\begin{bmatrix}h_{R,1} \\ h_{I,1}\end{bmatrix} & \begin{bmatrix}-B_{I,1} & -B_{R,1} \\ B_{R,1} & -B_{I,1}\end{bmatrix}\begin{bmatrix}h_{R,1} \\ h_{I,1}\end{bmatrix} & \cdots & \begin{bmatrix}A_{R,Q} & -A_{I,Q} \\ A_{I,Q} & A_{R,Q}\end{bmatrix}\begin{bmatrix}h_{R,1} \\ h_{I,1}\end{bmatrix} & \begin{bmatrix}-B_{I,Q} & -B_{R,Q} \\ B_{R,Q} & -B_{I,Q}\end{bmatrix}\begin{bmatrix}h_{R,1} \\ h_{I,1}\end{bmatrix} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \begin{bmatrix}A_{R,1} & -A_{I,1} \\ A_{I,1} & A_{R,1}\end{bmatrix}\begin{bmatrix}h_{R,N} \\ h_{I,N}\end{bmatrix} & \begin{bmatrix}-B_{I,1} & -B_{R,1} \\ B_{R,1} & -B_{I,1}\end{bmatrix}\begin{bmatrix}h_{R,N} \\ h_{I,N}\end{bmatrix} & \cdots & \begin{bmatrix}A_{R,Q} & -A_{I,Q} \\ A_{I,Q} & A_{R,Q}\end{bmatrix}\begin{bmatrix}h_{R,N} \\ h_{I,N}\end{bmatrix} & \begin{bmatrix}-B_{I,Q} & -B_{R,Q} \\ B_{R,Q} & -B_{I,Q}\end{bmatrix}\begin{bmatrix}h_{R,N} \\ h_{I,N}\end{bmatrix}\end{bmatrix}} & (11)\end{matrix}$

We now introduce the following definitions: $\begin{matrix}{\tilde{x}\overset{\Delta}{=}\begin{bmatrix}x_{R,1} \\ x_{I,1} \\ \vdots \\ x_{R,N} \\ x_{I,N}\end{bmatrix};\quad\tilde{s}\overset{\Delta}{=}\begin{bmatrix}\alpha_{1} \\ \beta_{1} \\ \vdots \\ \alpha_{Q} \\ \beta_{Q}\end{bmatrix};\quad\upsilon\overset{\Delta}{=}\begin{bmatrix}\upsilon_{R,1} \\ \upsilon_{I,1} \\ \vdots \\ \upsilon_{R,N} \\ \upsilon_{I,N}\end{bmatrix}.} & (12)\end{matrix}$

We have a linear relation between the input and output vectors {tilde over (s)} and {tilde over (x)}, respectively: $\begin{matrix}{\tilde{x} = \sqrt{\frac{\rho}{M}}\tilde{H}\tilde{s} + \upsilon,} & (13)\end{matrix}$ where the equivalent channel {tilde over (H)} is known to the receiver because the original channel H and the dispersion matrices {A_(q), B_(q)} are all known to the receiver. (Those skilled in the art will appreciate that an equivalent treatment can be formulated in terms of the dispersion matrices {C_(q), D_(q)} in place of the matrices {A_(q), B_(q)}. The matrices {C_(q), D_(q)} are defined by Eq. (3), above.)

The receiver simply uses Eq. (11) to find the equivalent channel. The system of equations between transmitter and receiver is not underdetermined as long as Q≦NT.
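The construction of the equivalent channel can be checked numerically against the complex block equation. The sketch below is illustrative only: the function names are assumptions, the dispersion matrices and channel are random placeholders, and both the noise term and the common factor √(ρ/M) are omitted so that the two sides can be compared exactly.

```python
import numpy as np

def real_block_A(Aq):
    """2T x 2M real block [[A_R, -A_I], [A_I, A_R]] of Eq. (11)."""
    return np.block([[Aq.real, -Aq.imag], [Aq.imag, Aq.real]])

def real_block_B(Bq):
    """2T x 2M real block [[-B_I, -B_R], [B_R, -B_I]] of Eq. (11)."""
    return np.block([[-Bq.imag, -Bq.real], [Bq.real, -Bq.imag]])

def equivalent_channel(A, B, H):
    """Build the 2NT x 2Q matrix of Eq. (11); A, B: (Q, T, M), H: (M, N)."""
    rows = []
    for n in range(H.shape[1]):
        h = np.concatenate([H[:, n].real, H[:, n].imag])  # [h_R,n; h_I,n]
        cols = []
        for Aq, Bq in zip(A, B):
            cols.append(real_block_A(Aq) @ h)  # column multiplying alpha_q
            cols.append(real_block_B(Bq) @ h)  # column multiplying beta_q
        rows.append(np.stack(cols, axis=1))    # block row n, shape (2T, 2Q)
    return np.vstack(rows)

def stack_real(X):
    """Stack X column by column as [x_R,1; x_I,1; ...; x_R,N; x_I,N]."""
    return np.concatenate([np.concatenate([X[:, n].real, X[:, n].imag])
                           for n in range(X.shape[1])])

rng = np.random.default_rng(0)
Q, T, M, N = 4, 4, 3, 2
A = rng.normal(size=(Q, T, M)) + 1j * rng.normal(size=(Q, T, M))
B = rng.normal(size=(Q, T, M)) + 1j * rng.normal(size=(Q, T, M))
H = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))
s = rng.normal(size=Q) + 1j * rng.normal(size=Q)

S = sum(s[q].real * A[q] + 1j * s[q].imag * B[q] for q in range(Q))
s_tilde = np.column_stack([s.real, s.imag]).ravel()  # [a_1, b_1, ..., a_Q, b_Q]
H_tilde = equivalent_channel(A, B, H)

lhs = stack_real(S @ H)    # complex model, then stacked into real form
rhs = H_tilde @ s_tilde    # real equivalent model
```

Here 2Q=8 unknowns are observed through 2NT=16 real equations, so the condition Q≦NT holds and the system is not underdetermined.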

We may therefore use any decoding technique already known for use, e.g., with V-BLAST, such as successive nulling and cancellation, its efficient square-root implementation, or sphere decoding. The most efficient implementations of these schemes generally require O(Q³) computations and have roughly constant complexity in the size of the signal constellation r. Sphere decoding, which is an efficient species of maximum-likelihood decoding, will in at least some cases be particularly advantageous.
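To make the role of the linear model concrete, the following sketch decodes with plain least squares followed by per-component slicing. This is a deliberately simpler stand-in (nulling without cancellation), not the successive nulling and cancellation or sphere decoders named above. The function name, the random matrix standing in for the equivalent channel, and the four-level alphabet (the per-dimension levels of an unnormalized 16-QAM) are all assumptions, and the example is noise-free.

```python
import numpy as np

def zero_forcing_decode(x_tilde, H_tilde, rho, M, levels):
    """Least-squares estimate of s_tilde, then nearest-level slicing."""
    scale = np.sqrt(rho / M)
    est, *_ = np.linalg.lstsq(scale * H_tilde, x_tilde, rcond=None)
    levels = np.asarray(levels, dtype=float)
    # For every component, pick the index of the closest alphabet level.
    idx = np.abs(est[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

rng = np.random.default_rng(1)
Q, T, M, N = 4, 4, 3, 2
rho = 100.0                                   # 20 dB
levels = np.array([-3.0, -1.0, 1.0, 3.0])     # assumed real alphabet
H_tilde = rng.normal(size=(2 * N * T, 2 * Q)) # stand-in for Eq. (11)
s_tilde = rng.choice(levels, size=2 * Q)      # [alpha_1, beta_1, ...]
x_tilde = np.sqrt(rho / M) * H_tilde @ s_tilde  # Eq. (13) without noise

decoded = zero_forcing_decode(x_tilde, H_tilde, rho, M, levels)
```

Any of the V-BLAST decoders mentioned above could replace `zero_forcing_decode` without changing the surrounding model, since they all operate on the same real linear system.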

Design of the dispersion matrices. In a broad sense, the mutual information between the input vector {tilde over (s)} and the output vector {tilde over (x)} is a measure of channel capacity as constrained by our definition of the “equivalent channel,” and contingent on the choice of dispersion matrices. When maximized, the mutual information expresses the maximum data rate achievable through the use of linear dispersion codes as described here, for given values of Q and T and for given numbers of transmit and receive antennas.

For purposes of the exemplary design method to be described below, we now define the mutual information between the input vector {tilde over (s)} and the output vector {tilde over (x)} as $\frac{1}{2T}E\log\det\left( I_{2NT} + \frac{\rho}{M}\tilde{H}\tilde{H}^{T} \right),$ where E(.) denotes the statistical expected value, I_(2NT) is the identity matrix of dimension 2NT, and {tilde over (H)}^(T) is the transpose of the matrix {tilde over (H)}.
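The expectation in this definition can be estimated by Monte Carlo averaging over channel draws. The sketch below is one hedged way to do so: it assumes base-2 logarithms (so the result is in bits per channel use) and a channel with independent CN(0,1) entries, and it builds the equivalent channel column by column as the stacked real response to a unit α_(q) or a unit β_(q), which is an equivalent shortcut rather than a literal transcription of Eq. (11). All function names are illustrative.

```python
import numpy as np

def stack_real(X):
    """Stack each column of X as [real; imag], one receive antenna at a time."""
    return np.concatenate([np.concatenate([X[:, n].real, X[:, n].imag])
                           for n in range(X.shape[1])])

def equivalent_channel(A, B, H):
    """Columns are the real-stacked responses to alpha_q = 1 and beta_q = 1."""
    cols = []
    for Aq, Bq in zip(A, B):
        cols.append(stack_real(Aq @ H))
        cols.append(stack_real(1j * (Bq @ H)))
    return np.stack(cols, axis=1)

def mutual_information_given_H(A, B, H, rho):
    """(1/2T) log2 det(I + (rho/M) H~ H~^T) for one channel realization."""
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    H = np.asarray(H, dtype=complex)
    _, T, M = A.shape
    Ht = equivalent_channel(A, B, H)
    gram = np.eye(Ht.shape[0]) + (rho / M) * (Ht @ Ht.T)
    _, logdet = np.linalg.slogdet(gram)   # natural-log determinant
    return logdet / np.log(2) / (2 * T)

def mutual_information(A, B, rho, N, trials=500, seed=0):
    """Monte Carlo average over H with independent CN(0,1) entries (assumed)."""
    M = np.asarray(A).shape[2]
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(trials):
        H = (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))) / np.sqrt(2)
        draws.append(mutual_information_given_H(A, B, H, rho))
    return float(np.mean(draws))

# Sanity check with T = M = N = Q = 1 and A_1 = B_1 = 1: for a fixed channel h
# the expression reduces to log2(1 + rho |h|^2).
one = np.ones((1, 1, 1))
exact = mutual_information_given_H(one, one, [[0.6 + 0.8j]], rho=3.0)
estimate = mutual_information(one, one, rho=3.0, N=1)
```

In the single-antenna sanity check, |h|=1 and ρ=3 give log₂(4)=2 bits per channel use, which confirms the 1/(2T) normalization of the real-valued form.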

As a general practice, we find it useful to take Q=min(M,N)T since this tends to maximize the mutual information between {tilde over (s)} and {tilde over (x)} while still having some coding effects.

We choose {A_(q),B_(q)} to maximize the mutual information between {tilde over (s)} and {tilde over (x)}. We formalize the design criterion as follows.

-   1. Choose Q≦NT (typically, Q=min(M,N)T).
-   2. Choose {A_(q), B_(q)} that solve the optimization problem $\begin{matrix}{C_{LD}\left( \rho,T,M,N \right) = \max\limits_{A_{q},B_{q},\; q = 1,\ldots,Q}\frac{1}{2T}E\log\det\left( I_{2NT} + \frac{\rho}{M}\tilde{H}\tilde{H}^{T} \right)} & (14)\end{matrix}$ subject to one of the following constraints $\begin{matrix}{\sum\limits_{q = 1}^{Q}\left( {tr}\, A_{q}^{*}A_{q} + {tr}\, B_{q}^{*}B_{q} \right) = 2TM} & (i) \\ {{tr}\, A_{q}^{*}A_{q} = {tr}\, B_{q}^{*}B_{q} = \frac{TM}{Q},\quad q = 1,\ldots,Q} & (ii) \\ {A_{q}^{*}A_{q} = B_{q}^{*}B_{q} = \frac{T}{Q}I_{M},\quad q = 1,\ldots,Q} & (iii)\end{matrix}$ where {tilde over (H)} is given by Eq. (11), with h_(R,n) and h_(I,n) having independent N(0,½) entries. (In our theoretical studies, we have assumed that the channel matrix H has independent CN(0,1) entries. However, our mutual information criterion is also readily applied for designing linear dispersion codes appropriate to channels described by other statistical distributions.)
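The three constraints can be tested mechanically for any candidate set of dispersion matrices. The sketch below uses an illustrative helper, `check_constraints`, and as a test case writes the two-antenna Alamouti codeword (with the sign convention in which the second amplitude is negated, an assumption) in the form of Eq. (1), with T=M=Q=2.

```python
import numpy as np

def check_constraints(A, B, tol=1e-9):
    """Report which of constraints (i)-(iii) hold; A, B have shape (Q, T, M)."""
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    Q, T, M = A.shape
    tr_a = np.array([np.trace(a.conj().T @ a).real for a in A])
    tr_b = np.array([np.trace(b.conj().T @ b).real for b in B])
    target = (T / Q) * np.eye(M)
    c1 = abs(tr_a.sum() + tr_b.sum() - 2 * T * M) < tol
    c2 = (np.allclose(tr_a, T * M / Q, atol=tol)
          and np.allclose(tr_b, T * M / Q, atol=tol))
    c3 = all(np.allclose(m.conj().T @ m, target, atol=tol)
             for m in list(A) + list(B))
    return {"i": bool(c1), "ii": bool(c2), "iii": bool(c3)}

# Alamouti codeword [[s1, s2], [-conj(s2), conj(s1)]] expanded as
# sum_q (alpha_q A_q + j beta_q B_q), with s_q = alpha_q + j beta_q.
A = np.array([[[1, 0], [0, 1]],
              [[0, 1], [-1, 0]]])
B = np.array([[[1, 0], [0, -1]],
              [[0, 1], [1, 0]]])

alamouti_ok = check_constraints(A, B)
scaled_ok = check_constraints(2 * A, B)   # doubling A_q violates all three
```

In this example the Alamouti matrices satisfy all three constraints, including the most stringent one, while the rescaled set satisfies none.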

The problem expressed by Eq. (14) can be solved subject to any of the constraints (i)-(iii). Constraint (i) is simply the power constraint of Eq. (8) (the normalization given under “Mathematical Details,” below), which ensures E tr SS*=TM. Constraint (ii) is more restrictive and ensures that each of the transmitted signals α_(q) and β_(q) is transmitted with the same overall power from the M antennas during the T channel uses. Finally, constraint (iii) is the most stringent, since it forces the signals α_(q) and β_(q) to be dispersed with equal energy in all spatial and temporal directions.

We have empirically found that of two codes with similar mutual informations, the one satisfying the more stringent constraint performs better.

The constraints (i)-(iii) are convex in the dispersion matrices {A_(q),B_(q)}. However, the cost function $\frac{1}{2T}E\log\det\left( I_{2NT} + \frac{\rho}{M}\tilde{H}\tilde{H}^{T} \right)$ is neither concave nor convex in the variables {A_(q),B_(q)}. Therefore, it is possible that Eq. (14) has local maxima. Nevertheless, we have been able to solve Eq. (14) with relative ease using gradient-based methods, and it does not appear that local maxima pose a great problem.

The block length T is essentially also a design variable. Although it must be chosen shorter than the coherence time of the channel, it can be varied to help the optimization of Eq. (14). We have found that choosing M≦T≦2M often yields good performance.

It should be noted that any code designed for a given number of receive antennas is also readily used for a greater number of receive antennas.

With reference to FIG. 3, a statistical description of the propagation channel H is input at block 50 of that figure. At block 60, the block length T, the size M of the transmit array, and the size N of the receiving array are provided to the processor. At block 65, the number Q is specified. At block 70, the optimization problem is solved to determine the set of 2Q dispersion matrices that maximizes the mutual information. At block 75, the processor outputs the dispersion matrices and the equivalent channel matrix {tilde over (H)}. At block 80, a calculated value of the mutual information is output by the processor.

EXAMPLE

We will present an orthogonal design of block length T=4 for M=3 transmit antennas, and will then compare the orthogonal design to a linear dispersion code for M=3 transmit antennas and N=1 receive antenna. The orthogonal design is written in terms of {α_(q)} and {β_(q)} as $\begin{matrix}{S = \sqrt{\frac{4}{3}}\begin{bmatrix}{\alpha_{1} + j\beta_{1}} & {\alpha_{2} + j\beta_{2}} & {\alpha_{3} + j\beta_{3}} \\ {-\alpha_{2} + j\beta_{2}} & {\alpha_{1} - j\beta_{1}} & 0 \\ {-\alpha_{3} + j\beta_{3}} & 0 & {\alpha_{1} - j\beta_{1}} \\ 0 & {-\alpha_{3} + j\beta_{3}} & {\alpha_{2} - j\beta_{2}}\end{bmatrix}.} & (15)\end{matrix}$

It turns out that this orthogonal design is also an LD code because, as we have found, it is a solution to Eq. (14) for T=4 and Q=3. It achieves a mutual information of 5.13 bits/channel use at ρ=20 dB, whereas the channel capacity is 6.41 bits/channel use.

To find a better LD code, we first observe that it is advantageous for Q to obey the constraint Q≦NT, with N=1 and T=4. Therefore Q≦4, and we choose Q=4. After optimizing Eq. (14) using a gradient-based search, we find: $\begin{matrix}{S = \begin{bmatrix}{\alpha_{1} + \alpha_{3} + j\left\lbrack \frac{\beta_{2} + \beta_{3}}{\sqrt{2}} + \beta_{4} \right\rbrack} & {\frac{\alpha_{2} - \alpha_{4}}{\sqrt{2}} - j\left\lbrack \frac{\beta_{1}}{\sqrt{2}} + \frac{\beta_{2} - \beta_{3}}{2} \right\rbrack} & 0 \\ {\frac{-\alpha_{2} + \alpha_{4}}{\sqrt{2}} - j\left\lbrack \frac{\beta_{1}}{\sqrt{2}} + \frac{\beta_{2} - \beta_{3}}{2} \right\rbrack} & {\alpha_{1} - j\frac{\beta_{2} + \beta_{3}}{\sqrt{2}}} & {-\frac{\alpha_{2} + \alpha_{4}}{\sqrt{2}} + j\left\lbrack \frac{\beta_{1}}{\sqrt{2}} - \frac{\beta_{2} - \beta_{3}}{2} \right\rbrack} \\ 0 & {\frac{\alpha_{2} + \alpha_{4}}{\sqrt{2}} + j\left\lbrack \frac{\beta_{1}}{\sqrt{2}} - \frac{\beta_{2} - \beta_{3}}{2} \right\rbrack} & {\alpha_{1} - \alpha_{3} + j\left\lbrack \frac{\beta_{2} + \beta_{3}}{\sqrt{2}} - \beta_{4} \right\rbrack} \\ {\frac{\alpha_{2} - \alpha_{4}}{\sqrt{2}} + j\left\lbrack \frac{\beta_{1}}{\sqrt{2}} + \frac{\beta_{2} - \beta_{3}}{2} \right\rbrack} & {-\alpha_{3} + j\beta_{4}} & {-\frac{\alpha_{2} + \alpha_{4}}{\sqrt{2}} + j\left\lbrack \frac{\beta_{1}}{\sqrt{2}} - \frac{\beta_{2} - \beta_{3}}{2} \right\rbrack}\end{bmatrix}} & (16)\end{matrix}$

This code has a mutual information of 6.25 bits/channel use at ρ=20 dB, which is most of the channel capacity. FIG. 4 compares the performance of the orthogonal design of Eq. (15) with the LD code of Eq. (16) at rate R=6. (The rate of either code is (Q/T) log₂r; we achieve R=6 by having the orthogonal design send 256-QAM, and the LD code send 64-QAM.) The decoding in both cases is the efficient form of nulling/cancelling described in U.S. patent application Ser. No. 09/438,900. We see from FIG. 4 that the LD code performs uniformly better. It is worth noting that the matrix S in Eq. (16) has orthogonal columns.

Mathematical Details

Normalization of the dispersion matrices. For purposes of theoretical analysis, we have assumed that the transmit signal S is normalized such that E tr SS*=TM. This induces the following normalization on the matrices {A_(q),B_(q)}: $\begin{matrix}{\sum\limits_{q = 1}^{Q}\left( {tr}\, A_{q}^{*}A_{q} + {tr}\, B_{q}^{*}B_{q} \right) = 2TM.} & (8)\end{matrix}$

Mathematical formulas for use in solving the optimization problem. In this section, we compute the gradient of the cost function of Eq. (14). To help compute this gradient, we rewrite the cost function in Eq. (14) as

$$\frac{1}{2T}E\log\det\left(I_{2NT} + \frac{\rho}{M}\sum_{q=1}^{Q}\begin{bmatrix}\tilde{A}_{q} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde{A}_{q}\end{bmatrix}\begin{bmatrix}\tilde{H}_{1}\\ \vdots\\ \tilde{H}_{N}\end{bmatrix}\begin{bmatrix}\tilde{H}_{1}\\ \vdots\\ \tilde{H}_{N}\end{bmatrix}^{T}\begin{bmatrix}\tilde{A}_{q} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde{A}_{q}\end{bmatrix}^{T} + \left(\tilde{B}_{q}\leftarrow\tilde{A}_{q}\right)\right) \tag{17}$$

where $\left(\tilde{B}_{q}\leftarrow\tilde{A}_{q}\right)$ denotes the preceding sum with $\tilde{B}_{q}$ substituted for $\tilde{A}_{q}$, and where for q=1, . . . , Q and n=1, . . . , N, we have defined

$$\tilde{A}_{q} = \begin{bmatrix}A_{R,q} & -A_{I,q}\\ A_{I,q} & A_{R,q}\end{bmatrix},\quad \tilde{B}_{q} = \begin{bmatrix}-B_{I,q} & -B_{R,q}\\ B_{R,q} & -B_{I,q}\end{bmatrix},\quad \tilde{H}_{n} = \begin{bmatrix}h_{R,n}\\ h_{I,n}\end{bmatrix}. \tag{18}$$

The subscript “R” denotes “real part”, and “I” denotes “imaginary part”. Define the matrix appearing in the log det(.) of Eq. (17) as Z. That is,

$$Z = I_{2NT} + \frac{\rho}{M}\sum_{q=1}^{Q}\begin{bmatrix}\tilde{A}_{q} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde{A}_{q}\end{bmatrix}\begin{bmatrix}\tilde{H}_{1}\\ \vdots\\ \tilde{H}_{N}\end{bmatrix}\begin{bmatrix}\tilde{H}_{1}\\ \vdots\\ \tilde{H}_{N}\end{bmatrix}^{T}\begin{bmatrix}\tilde{A}_{q} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde{A}_{q}\end{bmatrix}^{T} + \left(\tilde{B}_{q}\leftarrow\tilde{A}_{q}\right).$$

Define:

$$P_{q} = E\left(Z^{-1}\begin{bmatrix}\tilde{H}_{1}\\ \vdots\\ \tilde{H}_{N}\end{bmatrix}\begin{bmatrix}\tilde{H}_{1}^{T} & \cdots & \tilde{H}_{N}^{T}\end{bmatrix}\begin{bmatrix}\tilde{A}_{q} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde{A}_{q}\end{bmatrix}\right), \tag{19}$$

$$R_{q} = E\left(Z^{-1}\begin{bmatrix}\tilde{H}_{1}\\ \vdots\\ \tilde{H}_{N}\end{bmatrix}\begin{bmatrix}\tilde{H}_{1}^{T} & \cdots & \tilde{H}_{N}^{T}\end{bmatrix}\begin{bmatrix}\tilde{B}_{q} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde{B}_{q}\end{bmatrix}\right). \tag{20}$$
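The construction of Z from Eqs. (17) and (18) can be illustrated numerically. The sketch below is a hedged example, not a definitive implementation. It assumes that $h_n$ is the n-th column of an M×N channel matrix H, that the channel entries are independent CN(0,1), and that the expectation is approximated by a Monte Carlo average. All function names are hypothetical. The natural logarithm is used, so the result is in nats; dividing by ln 2 converts it to bits.

```python
import numpy as np

def real_rep_A(A):
    """tilde{A}_q of Eq. (18): [[A_R, -A_I], [A_I, A_R]]."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

def real_rep_B(B):
    """tilde{B}_q of Eq. (18): [[-B_I, -B_R], [B_R, -B_I]]."""
    return np.block([[-B.imag, -B.real], [B.real, -B.imag]])

def build_Z(A, B, H, rho):
    """Matrix Z for one channel realization H (assumed M x N, column n is h_n)."""
    Q, T, M = A.shape
    N = H.shape[1]
    # Stack tilde{H}_n = [h_R,n; h_I,n] for n = 1..N into one vector of length 2NM.
    h = np.concatenate([np.concatenate([H[:, n].real, H[:, n].imag]) for n in range(N)])
    Z = np.eye(2 * N * T)
    for q in range(Q):
        for rep in (real_rep_A(A[q]), real_rep_B(B[q])):
            v = np.kron(np.eye(N), rep) @ h  # block-diagonal matrix times stacked channel
            Z += (rho / M) * np.outer(v, v)
    return Z

def cost(A, B, rho, N, n_samples=200, seed=0):
    """Monte Carlo estimate of (1/2T) E log det Z (assumed CN(0,1) channel entries)."""
    rng = np.random.default_rng(seed)
    Q, T, M = A.shape
    total = 0.0
    for _ in range(n_samples):
        H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
        total += np.linalg.slogdet(build_Z(A, B, H, rho))[1]
    return total / (2 * T * n_samples)

# Hypothetical example: random dispersion matrices scaled to satisfy Eq. (8).
rng = np.random.default_rng(1)
Q, T, M, N = 4, 2, 2, 2
A = rng.standard_normal((Q, T, M)) + 1j * rng.standard_normal((Q, T, M))
B = rng.standard_normal((Q, T, M)) + 1j * rng.standard_normal((Q, T, M))
scale = np.sqrt(2 * T * M / (np.sum(np.abs(A) ** 2) + np.sum(np.abs(B) ** 2)))
A, B = scale * A, scale * B
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Z = build_Z(A, B, H, rho=100.0)               # rho = 100 corresponds to 20 dB
f_hat = cost(A, B, rho=100.0, N=N, n_samples=50)
print(Z.shape, f_hat)
```

Because Z is the identity plus a sum of rank-one positive semidefinite terms, it is symmetric with eigenvalues no smaller than one, so the log-determinant is well defined for every channel draw.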

The gradients of the cost function $f = \frac{1}{2T}E\log\det Z$ are given by:

$$\left[\frac{\partial f\left(A_{R,q}\right)}{\partial A_{R,q}}\right]_{ij} = \frac{2\rho}{TM}\sum_{n=1}^{N}\left(P_{q,\,i+(2n-2)T,\,j+(2n-2)M} + P_{q,\,i+(2n-1)T,\,j+(2n-1)M}\right) \tag{21}$$

$$\left[\frac{\partial f\left(A_{I,q}\right)}{\partial A_{I,q}}\right]_{ij} = \frac{2\rho}{TM}\sum_{n=1}^{N}\left(P_{q,\,i+(2n-1)T,\,j+(2n-2)M} - P_{q,\,i+(2n-2)T,\,j+(2n-1)M}\right) \tag{22}$$

$$\left[\frac{\partial f\left(B_{R,q}\right)}{\partial B_{R,q}}\right]_{ij} = \frac{2\rho}{TM}\sum_{n=1}^{N}\left(R_{q,\,i+(2n-1)T,\,j+(2n-2)M} - R_{q,\,i+(2n-2)T,\,j+(2n-1)M}\right) \tag{23}$$

$$\left[\frac{\partial f\left(B_{I,q}\right)}{\partial B_{I,q}}\right]_{ij} = \frac{-2\rho}{TM}\sum_{n=1}^{N}\left(R_{q,\,i+(2n-2)T,\,j+(2n-2)M} + R_{q,\,i+(2n-1)T,\,j+(2n-1)M}\right). \tag{24}$$
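A closed-form gradient of this kind is commonly sanity-checked against a numerical derivative before it is used in an optimization loop. The Python sketch below does not implement Eqs. (21) through (24). It substitutes a central finite difference of a sample-average cost with respect to the real part of one dispersion matrix, which is a different technique used here only as a cross-check. The function names, the fixed set of channel samples, and the CN(0,1) channel model are assumptions.

```python
import numpy as np

def real_rep(A, B):
    """tilde{A}_q and tilde{B}_q of Eq. (18) for complex T x M matrices."""
    At = np.block([[A.real, -A.imag], [A.imag, A.real]])
    Bt = np.block([[-B.imag, -B.real], [B.real, -B.imag]])
    return At, Bt

def cost(A, B, channels, rho):
    """Sample average of (1/2T) log det Z over fixed stacked channel vectors."""
    Q, T, M = A.shape
    N = channels.shape[1] // (2 * M)
    total = 0.0
    for h in channels:
        Z = np.eye(2 * N * T)
        for q in range(Q):
            for rep in real_rep(A[q], B[q]):
                v = np.kron(np.eye(N), rep) @ h
                Z += (rho / M) * np.outer(v, v)
        total += np.linalg.slogdet(Z)[1]
    return total / (2 * T * len(channels))

def fd_gradient_A_real(A, B, channels, rho, q, eps=1e-6):
    """Central-difference estimate of the gradient with respect to A_{R,q}."""
    _, T, M = A.shape
    grad = np.zeros((T, M))
    for i in range(T):
        for j in range(M):
            A_plus, A_minus = A.copy(), A.copy()
            A_plus[q, i, j] += eps   # perturbs only the real part of entry (i, j)
            A_minus[q, i, j] -= eps
            grad[i, j] = (cost(A_plus, B, channels, rho)
                          - cost(A_minus, B, channels, rho)) / (2 * eps)
    return grad

# Hypothetical example with Q = 4, T = M = 2, N = 1 and 50 fixed channel samples.
rng = np.random.default_rng(2)
Q, T, M, N = 4, 2, 2, 1
A = rng.standard_normal((Q, T, M)) + 1j * rng.standard_normal((Q, T, M))
B = rng.standard_normal((Q, T, M)) + 1j * rng.standard_normal((Q, T, M))
scale = np.sqrt(2 * T * M / (np.sum(np.abs(A) ** 2) + np.sum(np.abs(B) ** 2)))
A, B = scale * A, scale * B
# Real and imaginary parts of CN(0,1) entries each have variance 1/2.
channels = rng.standard_normal((50, 2 * N * M)) / np.sqrt(2)
grad = fd_gradient_A_real(A, B, channels, rho=10.0, q=0)
print(grad)
```

Keeping the channel samples fixed across both evaluations makes the difference quotient deterministic, so any disagreement with an analytic gradient evaluated on the same samples points to an indexing or sign error rather than to Monte Carlo noise.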

1. A method of transmitting data over a wireless communication channel, wherein data are to be embodied in at least one signal distributed over plural transmit antennas, or over plural time intervals, or both, according to a space-time matrix, the method comprising: mapping a block of data to a sequence of symbols, each symbol having a scalar value; defining each element of the space-time matrix as a weighted sum of symbols, wherein weights are assigned to each symbol according to a respective dispersion matrix for that symbol, the dispersion matrices do not lead to an orthogonal design, and at least one dispersion matrix is effective for distributing its associated symbol over two or more transmit antennas; and transmitting a signal according to the space-time matrix.
2. The method of claim 1, wherein the symbols are selected from a symbol constellation.
3. The method of claim 1, wherein the sequence of symbols comprises complex-valued symbols selected from a constellation and further comprises the complex conjugates of the selected symbols.
4. The method of claim 1, wherein the sequence of symbols comprises real numbers whose amplitudes equal the real parts of elements selected from a constellation, and imaginary numbers whose amplitudes equal the imaginary parts of the selected constellation elements.
5. The method of claim 1, wherein the dispersion matrices are determined by an optimization procedure directed at making the most efficient use of the channel capacity.
6. The method of claim 1, wherein the dispersion matrices are determined by an optimization procedure that seeks to maximize a measure of mutual information between transmitted and received signals.
7. The method of claim 6, wherein the measure of mutual information is derived, in part, from an effective channel matrix $\tilde{H}$ that linearly relates a vector $\tilde{s}$ of real and imaginary parts of selected symbols to a vector $\tilde{x}$ of real and imaginary parts of signals received due to transmission of the selected symbols, the linear relationship having the form $\tilde{x}$ = (normalizing factor) × $\tilde{H}\tilde{s}$ + (additive noise vector).
8. The method of claim 7, wherein the measure of mutual information is proportional to a statistical expected value of $\log\det\left(I_{2NT} + \frac{\rho}{M}\tilde{H}\tilde{H}^{T}\right)$, wherein M is the number of transmit antennas, there are N receiving antennas, there is a measured signal-to-noise ratio of ρ, the signal to be transmitted is distributed over T time intervals, $I_{2NT}$ is the identity matrix of dimension 2NT, and $\tilde{H}^{T}$ is the conjugate transpose of $\tilde{H}$.
9. The method of claim 6, wherein the measure of mutual information is derived from an effective channel matrix $\tilde{H}$, each element of $\tilde{H}$ is a weighted sum of real and imaginary parts of measured channel coefficients, and in each said element, the weights are real and imaginary parts of elements of dispersion matrices.
10. The method of claim 9, wherein the measure of mutual information is proportional to a statistical expected value of $\log\det\left(I_{2NT} + \frac{\rho}{M}\tilde{H}\tilde{H}^{T}\right)$, wherein M is the number of transmit antennas, there are N receiving antennas, there is a measured signal-to-noise ratio of ρ, the signal to be transmitted is distributed over T time intervals, $I_{2NT}$ is the identity matrix of dimension 2NT, and $\tilde{H}^{T}$ is the conjugate transpose of $\tilde{H}$.